Methods in Ecology and Evolution
Wiley
Preprints posted in the last 30 days, ranked by how well they match Methods in Ecology and Evolution's content profile, based on 160 papers previously published here. The average preprint has a 0.16% match score for this journal, so anything above that is already an above-average fit.
Ma, Z.; Ellison, A. M.
Diversity and heterogeneity are related but distinct and often conflated concepts. Diversity quantifies the number or relative abundance of discrete objects (e.g. species), whereas heterogeneity includes interactions among them (i.e. in networks) and between them and their environments. Although estimation, testing, and inference of diversity are well established and understood in ecology, comparable methods for heterogeneity are themselves diverse and rarely applied consistently or coherently. We propose a consistent and coherent methodology for estimation, testing, and inference of heterogeneity of ecological networks. Estimation of heterogeneity is scalable from individuals to populations using the variance-to-mean (V/M) ratio and extensions of Taylor's power law (TPL) to analyzing networks. Bootstrapping is used to partition heterogeneous and random clusters, whereas permutation tests are used to compare individual- and network-level heterogeneity. Inference includes the identification of "important" (e.g. dominant, foundation, keystone) species and "rich clubs" in heterogeneous networks, detection of biomarkers, and analysis of heterogeneity-stability relationships. We demonstrate this methodology using the global Earth Microbiome Project dataset. The method could reliably distinguish heterogeneous nodes and networks; identified significant differences in heterogeneity among microbial assemblages in different habitats and in specific sites within habitats; and supported established principles of host filtering, species sorting, and niche partitioning. Our methods for estimation, testing, and inference of heterogeneity are modular, scalable, and applicable to a wide range of ecological systems. They also provide a quantitative method for understanding how evolutionary and ecological forces jointly shape both topology and heterogeneity in ecological networks.
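As a rough illustration of the two estimators named in this abstract, the sketch below computes a variance-to-mean ratio for one node and fits Taylor's power law, V = a * M^b, by ordinary least squares on the log-log scale. The function names and the input layout (one abundance sequence per node or group) are assumptions made for this example. The abstract does not specify the authors' network extension, bootstrap, or permutation procedures, and none of those are reproduced here.

```python
import math
from statistics import mean, variance


def variance_to_mean(abundances):
    """Return the variance-to-mean (V/M) ratio of one sequence of abundances.

    Uses the sample variance (n - 1 denominator). Hypothetical helper,
    not taken from the authors' code.
    """
    m = mean(abundances)
    if m == 0:
        raise ValueError("mean abundance is zero; V/M is undefined")
    return variance(abundances) / m


def fit_taylor_power_law(groups):
    """Fit V = a * M**b by least squares on log(V) versus log(M).

    `groups` is an iterable of abundance sequences (assumed layout).
    Groups with a non-positive mean or variance are skipped because
    their logarithm is undefined. Returns the tuple (a, b).
    """
    xs, ys = [], []
    for g in groups:
        m, v = mean(g), variance(g)
        if m > 0 and v > 0:
            xs.append(math.log(m))
            ys.append(math.log(v))
    if len(xs) < 2:
        raise ValueError("need at least two usable groups")
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all group means are identical; slope is undefined")
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    a = math.exp(ybar - b * xbar)
    return a, b
```

Under the usual reading of these statistics, a V/M ratio near 1 is what a Poisson (random) pattern would give, and a fitted exponent b above 1 indicates that variance grows faster than the mean.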
De Marco, R.
This paper presents a six-stage methodological framework for Convolutional Neural Network (CNN)-based cetacean vocalization detection and classification in Passive Acoustic Monitoring (PAM), implemented as the open-source toolkit ai-pam-pipeline. The framework is generalizable across species and fully parameterised through a single configuration file, guaranteeing exact experimental reproducibility. Two experiments are reported. Experiment A examines the effect of FFT window length Nfft ∈ {256, 512, 1024} on binary bottlenose dolphin (Tursiops truncatus) whistle detection using stratified 10-fold cross-validation on an in-domain dataset (Oltremare, 192 kHz) and a cross-domain benchmark (DCLDE 2022). In-domain performance is uniformly high (macro F1 ≈ 0.98; Wilcoxon, all p > 0.05). Cross-domain results diverge substantially: Nfft = 256 is significantly superior (p = 0.006, rank-biserial r = 0.89). The mechanism is an upsampling amplification effect: coarser spectral bins produce wider, higher-contrast FM traces after bilinear resampling to fixed image dimensions. This superiority is threshold-invariant: precision equals 1.000 across all configurations and thresholds θ ∈ [0.1, 0.9], confirming that the advantage is not an artifact of threshold choice. These findings demonstrate that preprocessing choices -- often treated as secondary implementation details -- can significantly affect cross-domain generalisation. While Nfft serves here as a controlled case study, the framework is designed to enable systematic, reproducible evaluation of arbitrary preprocessing parameters within a unified experimental protocol. Experiment B demonstrates multiclass capability on five T. truncatus vocalization categories (macro F1 = 0.843); inter-class confusion between click trains and burst-pulse sounds reflects biological signal overlap rather than classifier failure.
Kadlec, I.; Bartak, V.; Selimovic, A.; Kutal, M.; Dula, M.; Stier, N.; Meissner-Hylanova, V.; Peskova, L. B.; Sladecek, M.; Vorel, A.; Signer, J.
Classifying animal movement strategies from GPS tracking data is essential for understanding space use, population dynamics and conservation planning. However, existing approaches either require strong parametric assumptions about trajectory shape, large labelled (i.e. expert-annotated) datasets for machine learning, or lack formal uncertainty quantification. These limitations create barriers for researchers working with novel species or limited sample sizes. We present a profile-based classification framework consisting of three steps. First, trajectories are segmented using breakpoint detection applied to Net Squared Displacement (NSD) time series. Movement metrics are then extracted from each segment and classified by comparing them to empirically derived behavioural profiles via Z-score distances transformed to softmax probabilities. Bootstrap resampling quantifies uncertainty in the resulting classifications from both training and test data. We validated the framework through simulation experiments and applied it to GPS tracking data from two ecologically contrasting species: gray wolf (Canis lupus; 43 individuals) and northern lapwing (Vanellus vanellus; 15 individuals). Simulations showed that 5-10 training segments per movement strategy suffice for reliable classification, with overall accuracy of 91.1% across residential, floating and dispersal strategies. Segment duration of 30-60 days was required for confident discrimination of residential and floating behaviour. For wolves, the framework clearly distinguished residency, floating or dispersal (91.2% of segments classified with >50% probability). For lapwings, migration was identified with high confidence, while residential-floating discrimination reflected genuine ecological ambiguity confirmed by domain experts, with bootstrap confidence intervals transparently flagging uncertain cases. The profile-based framework provides an accessible, interpretable alternative to parametric NSD fitting and machine learning approaches, requiring modest training data while delivering probabilistic classifications with honest uncertainty estimates. An R package (moveprofile) implementing the complete workflow is freely available. The framework is applicable to any tracked species where distinct movement strategies can be identified by expert knowledge.
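The classification step described in this abstract (Z-score distances turned into softmax probabilities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the moveprofile package's API: the function names, the profile layout (metric name mapped to a mean and standard deviation), the Euclidean distance in Z-score space, and the softmax over negative distances are all choices made for this example. Segmentation and bootstrap resampling are not shown.

```python
import math


def zscore_distance(metrics, profile):
    """Euclidean distance between a segment and a profile in Z-score space.

    `metrics` maps metric name -> observed value for one segment.
    `profile` maps metric name -> (mean, sd) for one movement strategy.
    Both layouts are assumptions made for this sketch.
    """
    return math.sqrt(
        sum(((metrics[name] - mu) / sd) ** 2 for name, (mu, sd) in profile.items())
    )


def classify_segment(metrics, profiles):
    """Return a dict of strategy -> probability for one segment.

    Probabilities are a softmax over negative Z-score distances, so the
    closest profile receives the highest probability.
    """
    dists = {name: zscore_distance(metrics, p) for name, p in profiles.items()}
    dmin = min(dists.values())  # shift by the minimum for numerical stability
    weights = {name: math.exp(-(d - dmin)) for name, d in dists.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

A segment whose metrics sit exactly on one profile's means gets a probability close to 1 for that strategy, while a segment equidistant from two profiles gets 0.5 for each, which is the kind of ambiguous case an uncertainty estimate would flag.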
Smith, T. Q.; Szpiech, Z. A.
Patterson's D statistic, also known as the ABBA-BABA statistic, is widely used to detect the presence of archaic genome-wide introgression between two non-sister taxa. Requiring only a single lineage from each of four taxa, where one taxon acts as an outgroup to determine the ancestral allele, Patterson's D measures the imbalance between the number of biallelic sites where the second and third taxa share the derived allele (ABBA sites) and the number where the first and third taxa share it (BABA sites). When there is no introgression, these counts are expected to be equal, and a discordance between counts suggests introgression from the third taxon into either the first or second. Patterson's D is limited to the detection of genome-wide introgression and exhibits a high false-positive rate when applied to smaller genomic segments. Here, we present a new method, D STatistic with Allelic Rarefaction (D*), to address these limitations. D* uses multiple lineages and does not require an outgroup to calculate the imbalance between the number of alleles found exclusively in the second and third taxa and the number of alleles found exclusively in the first and third taxa. D* employs a rarefaction technique to correct for unequal sample sizes and allows multiallelic sites. We use simulations to show that D* has better precision and recall for detecting introgressed segments of DNA when compared to similar methods under a wide variety of model parameters and in the presence of technical artifacts common to ancient DNA analyses. We conclude with an analysis of Denisovan DNA introgression in modern-day Papuans. Precompiled executables, the manual, and source code can be found at https://github.com/TQ-Smith/DSTAR
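For readers unfamiliar with the classic statistic that D* builds on, the sketch below counts ABBA and BABA sites and computes D = (nABBA - nBABA) / (nABBA + nBABA). It illustrates only the standard single-lineage Patterson's D, not D* or its rarefaction step, and it is unrelated to the DSTAR code base. The function name and the input layout (one tuple of four alleles per site, ordered first taxon, second taxon, third taxon, outgroup) are assumptions for this example.

```python
def patterson_d(sites):
    """Compute Patterson's D from per-site alleles of four lineages.

    `sites` is an iterable of (p1, p2, p3, outgroup) alleles, one lineage
    per taxon (assumed layout). The outgroup allele is treated as the
    ancestral state "A"; the other allele at a biallelic site is "B".
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if len({p1, p2, p3, out}) != 2:
            continue  # keep biallelic sites only
        if p3 == out:
            continue  # the third taxon must carry the derived allele
        if p2 == p3 and p1 == out:
            abba += 1  # derived allele shared by taxa 2 and 3
        elif p1 == p3 and p2 == out:
            baba += 1  # derived allele shared by taxa 1 and 3
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / (abba + baba)
```

With three ABBA sites and one BABA site, D is (3 - 1) / 4 = 0.5. In practice a significance test (commonly a block jackknife) is applied to D, which this sketch omits.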
Swiston, S. K.; Kuehne, L.; Moore, R.; Landis, M. J.
Computational workshops are common in evolutionary biology and are used to share discipline-specific tools and skills with researchers. Despite the perceived importance of these workshops, there is no common set of criteria for workshop success, and there are few peer-reviewed studies investigating the efficacy of workshops or assessing the value of particular instructional techniques in this context. Here, we focused on one key element of a successful workshop: its ability to increase participants' motivation to use the methods and tools presented during the workshop. We analyzed the goals, perceptions, and future plans of research practitioners engaging in a workshop on phylogenetic methods of historical biogeography using pre- and post-workshop surveys. Overall, the workshop was successful at motivating participants, and survey responses provided insights into participants' perceptions of different activities, including "participatory live coding". Apart from this case study, we aim to highlight the importance of developing a common set of workshop goals in collaboration with other workshop stakeholders and the need for specialized, validated tools for assessing the efficacy of computational workshops for researchers.
Dhananjanie, A.; Thompson, H.; Vercelloni, J.; Warne, D. J.
Explainable machine learning (ML) methods are gaining increasing attention in environmental and ecological research for their ability to reveal relationships between environmental drivers and population dynamics. However, questions remain about the reliability of these tools, especially given recent research showing that these explanations can be highly sensitive to model architecture. In ecology, it is typical to use a single ML model, and comparative evaluation of the sensitivity of explanations across different ML approaches is overlooked. In this paper, we develop a novel framework that quantifies explanation consistency between multiple ML model architectures. This framework provides a discrepancy measure for each model prediction, with high discrepancy indicating substantive explanation disagreement across models and low discrepancy indicating strong consensus in explanations across models. We then demonstrate that low explanation discrepancy aligns well with the ground-truth mechanism. Furthermore, high explanation discrepancy provides a mechanism to identify areas for model refinement and further investigation by domain experts. We do this using a simulation study based on synthetic coral cover data that incorporate spatio-temporal variability driven by known disturbance effects. Our method provides a quantitative approach to assess the sensitivity of explainable ML in the absence of ground truth. As a result, this enhances the utility of ML approaches in conservation and ecological management. While we focus primarily on ecological modelling for coral reefs, our methods are generally applicable to other ecological and environmental modelling settings.
Benner, S.; Shiono, S.; Kagawa, T.; Hattori, K.; Yamasue, H.; Lipp, H.-P.; Endo, T.
Long-term, automated tracking of group-housed social animals using RFID (radio frequency identification) is a promising approach in ethological neuroscience. However, low-frequency (LF) RFID, while long-established in the field, is constrained by its inherent low data rates, which lead to two critical limitations: (1) compromised spatiotemporal resolution, and (2) the inability to identify multiple tags (animals) simultaneously. To address these limitations, we developed eeeHive, a high-frequency (HF) RFID-based animal tracking system with a fully custom hardware architecture that enables high-speed, multiplexed antenna polling and concurrent multi-tag reading. The polling time per antenna in eeeHive was 5.9 ms, with an additional 8.2 ms read time per tag. We applied the system to track 24 mice for one week, and six common marmosets for seven weeks. The system successfully tracked individuals even within dense clusters, revealing complex behavioral traits characterized by spatial utilization, temporal dynamics, behavioral regularity, and inter-individual relationships. Additional tests with Japanese fire-bellied newts and Nile tilapia juveniles demonstrated comparable tracking performance in aquatic environments. Taken together, eeeHive overcomes the inherent limitations of conventional LF RFID, establishing a powerful HF RFID-based platform for fine-scale behavioral tracking of group-housed animals across terrestrial and aquatic species.
Tous, J.; Chiquet, J.
A major goal of community ecology lies in deciphering the processes underlying species distribution. A widespread approach to this question is to identify patterns in species community data and relate them to possible processes. Joint Species Distribution Models (JSDMs) offer one way to do so through the inference of association networks that describe patterns of statistical correlations and dependencies between species, but it is unclear what processes can explain the presence of such correlations. While it has now been established that there is no equivalence between JSDM-inferred associations and biotic interactions, the latter remain one possible explanation, among others, for the former. However, to our knowledge, there is no specific study of the statistical patterns induced by different types of interactions or of the conditions under which they may or may not appear as statistical correlations or dependencies in species communities. To explore these questions, we propose a "virtual ecologist" approach that consists in simulating community data based on abiotic and biotic processes with the VirtualCom model, which emulates the effects of environmental processes and of competition and facilitation interactions. Then, we study to what extent JSDMs retrieve correlations between species that match the simulated interactions. We show that these interactions are better identified when using JSDMs that model partial correlations between species rather than marginal ones. We further demonstrate how critical it is to correctly model abiotic effects in order to identify biotic ones and that the "correct modelling" of these effects depends on the type of interactions at stake.
Zogby, D. S.; Eddington, V. M.; Craig, E. C.; Kloepper, L. N.
Common terns (Sterna hirundo) are regionally threatened migratory seabirds that form large breeding colonies during the North American summer months. They are highly vocal and serve as important bioindicators of aquatic ecosystems. Historically, acoustic studies on colonial seabirds have proven difficult due to the dense aggregations of individuals and high rate of call overlap. However, as passive acoustic monitoring (PAM) becomes increasingly common for studying seabird colonies, quantitative descriptions of species vocalizations are needed to accurately interpret behavioral information from colony soundscapes and support automated analysis of large acoustic datasets. This study aims to quantify the vocal repertoire of adult common terns. We deployed AudioMoths to collect acoustic data at a tern colony on Seavey Island, New Hampshire, USA from across the breeding season. Using RavenPro, unique call types were identified through visual and aural inspection of the acoustic data in the spectrogram. For each call, we then extracted measurements of peak frequency (Hz), bandwidth 90% (Hz), syllable duration 90% (s), and total bout duration (s) to quantify the characteristics of each call type. Statistical analyses for acoustic parameters by call type were performed using Kruskal-Wallis tests, followed by post-hoc Dunn tests. Our results demonstrate that each call type is significantly different from another by at least one parameter, with the exception of the kek and kip/tjuk calls. These findings present the first quantitative analysis of common tern vocalizations for North America. By defining temporal and spectral characteristics for multiple call types, this work helps translate colony soundscape into biologically meaningful information about tern behavior and colony dynamics. These descriptions also provide key parameters for developing automated tools to detect and classify vocalizations in dense, noisy colonies. 
Integrating quantified vocal characteristics with PAM offers a promising approach for monitoring colony activity and behavior while minimizing disturbance relative to traditional methods.
Li, B.; Ane, C.
Phylogenetic network inference methods are increasingly used to detect hybridization and gene flow from genomic data, but their robustness to common sources of model violation remains poorly characterized. We conducted a simulation study to evaluate the effects of hidden paralogy and substitution rate variation on two widely used network inference methods: find_graphs from ADMIXTOOLS 2 and SNaQ. Using an eight-taxon species tree calibrated from an empirical reptile phylogeny, we simulated data under various levels of hidden paralogy (from none to strong) and three levels of rate variation (none, gene-specific, and lineage-specific). We found that hidden paralogy had limited impact on network inference under the conditions examined: both network methods correctly favored a tree without reticulation, and ASTRAL recovered the correct species tree every time. In contrast, lineage-specific rates severely biased find_graphs, inflating worst f-statistic residuals well beyond the standard acceptance threshold. SNaQ correctly selected a tree model almost always across all conditions, though its network with h = 1 reticulation displayed the true species tree with a lower probability under lineage-specific rates. We also show that the standard worst residuals threshold of 3 for find_graphs produces inflated type I error even without rate variation, and we recommend empirical calibration of this threshold within each study system.
Hipp, A. L.; Althaus, K. N.; Fuller, E. L.; Hahn, M.; Larson, D. A.; Mohn, R. A.; Wang, B.; Manos, P. S.
Forest trees pose numerous potential challenges to phylogenomic inference. Their large effective population sizes and relatively long generation times lead to deep allele coalescence and consequently incomplete lineage sorting (ILS), which biases inferences of divergence times toward older ages and introduces gene tree discordance. Deep phylogenetic divergences, reaching back into the Paleocene, introduce reference-mapping biases. Introgression--the movement of genes between lineages--may result in different phylogenies being inferred depending on which individuals are included in analysis, even if the plurality of the genome favors the divergence history unaffected by introgression. These factors influence phylogenetic inference across the Tree of Life but are particularly prevalent in forest trees. Oaks (Quercus) are notable for all three influences. In addition, our knowledge of the oak phylogeny is currently based strongly on restriction site associated DNA sequencing (RADseq) datasets published over the past decade, which may introduce additional sources of uncertainty. In this chapter, we analyze a 322-species RADseq dataset and genome resequencing data from across the genus to address sources of uncertainty in our understanding of the global oak phylogeny, which we hope will serve as a model for other research groups working on comparable woody plant groups.
Remy, E.; Carlier, A.; Massol, E.; Kacimi, R.; Chaine, A. S.; Cauchoix, M.
Widespread arthropod declines pose risks to ecosystem functioning and agriculture. Assessing this decline or potential remediation implies the need for standardized and scalable population monitoring. Image-based methods, including camera traps and citizen science programs, are increasingly used, but the volume of data collected requires automated analysis. Robust arthropod detection is essential for individual counting or fine-grained classification, yet current datasets and algorithms do not address the vast morphological diversity across arthropod species and often overlook the variety of photographic contexts, such as differences in background, lighting, and image composition, in which arthropods are captured. To address this gap, we developed an arthropod detection dataset, covering all terrestrial families present in France with available validated images on the iNaturalist platform (749 families). To achieve this, we employed an iterative workflow in which a YOLOv11 model pre-annotated images -- using one representative species per family-- followed by manual correction and model retraining. Repeating this process progressively reduced annotation effort and improved model accuracy. The final outcome consists of a publicly available curated detection dataset and a robust arthropod detector for natural background scenes. The detector achieves an F1-score of 0.91, demonstrating strong performance despite substantial interspecific morphological variation and heterogeneity in photographic contexts. We further demonstrated the taxonomical universality of the model showing high F1-score and IoU averaged at the class (0.79, 0.85) and order level (0.82, 0.86) and also a good detection generalizability (F1-score>0.90, IoU>0.83) on species, genera and families never encountered by the model during training. 
Finally, we show how this model can be improved to generalize to new datasets using data augmentation, complementary training data or fine-tuning, and to increase detection of small objects. In particular, we report performance of the improved models on three use cases widely used in non-lethal insect monitoring: (i) diurnal pollinator monitoring through citizen science or (ii) flower and nocturnal insects monitoring through smartphone time-lapse of a UV-illuminated white panel. These results mark an important step toward automated analysis of arthropod images in natural contexts, from both large-scale automated monitoring approaches and citizen science monitoring programs.
Sharma, P.; Kezia, K.; Seshadri, K. S.
Passive Acoustic Monitoring (PAM) has emerged as a transformative tool for biodiversity assessment in recent years. Despite widespread acceptance and application for conservation-related outcomes, the synergistic effects of hardware limitations, signal propagation, and environmental conditions on how far a signal can be reliably detected remain critically understudied. We quantified changes in signal detectability using Autonomous Recording Units (ARUs) in a tropical agroecosystem using playback experiments of standardised pure tones (1-8 kHz) in fallow rice paddy fields. We deployed a four-ARU array and broadcast signals over a 50-300 m distance gradient, and modelled operative detectability of signals using a binomial Generalised Linear Mixed-effects Model (GLMM). Our findings show that the detection space of an ARU is highly frequency-dependent and environmentally modulated. Detection probability for low-frequency signals (1 kHz) decreased rapidly (50% threshold at ~100 m), whereas mid-range frequencies (4-6 kHz) occupied an acoustic window that remained reliably detectable up to 250 m. Higher relative humidity significantly enhanced overall detection, while increasing temperatures disproportionately reduced low-frequency detectability. The orientation of the ARU to the signal source was important, as the detection probability declined from 81% for recorders facing the source (0°) to 14% for rear-facing units (180°). Our findings underscore the importance of determining the detection space before undertaking PAM. We propose a Decision Support Framework that provides a pathway for researchers to integrate focal taxa traits with technical constraints to determine detection space and optimise study designs when using PAM for monitoring biodiversity and assessing conservation action.
Perrin, S. W.; Adjei, K. P.; Mostert, P.; Togunov, R. R.; Herfindal, I.; Topper, J. P.; Grytnes, J.-A.; Chipperfield, J.; O'Hara, R. B.; Finstad, A. G.
Aim: A comprehensive understanding of the spatial distribution of biodiversity is hindered by fragmented datasets, sampling biases, and inconsistent observation protocols. Here, we present a workflow that integrates disparate datasets to produce large-scale maps of biodiversity metrics as a basis for management-relevant information tools. We use integrated species distribution modeling (iSDM) to account for sampling biases and disparate data collection techniques, taking advantage of the vast numbers of open datasets available in data aggregators like GBIF. Location: Norway (excluding Svalbard and Jan Mayen). Taxon: Vascular plants. Methods: The workflow consists of four main steps: data acquisition, data integration, integrated species distribution modelling (iSDM), and the production of derived outputs. Input data include structured surveys, opportunistic observations, and environmental covariates. These are standardised and integrated into a point-process-based iSDM framework to produce species richness maps, associated uncertainties, and sampling effort maps. The outputs are further processed to identify biodiversity hotspots or to summarise species-environment relationships. The workflow used vascular plant data from Norway, combining occurrence-only and presence-absence datasets with environmental covariates. Outputs were generated at a spatial resolution of 500 x 500 meters, balancing accuracy, computational feasibility and relevance for management decisions. High-performance computing resources were utilized for model fitting and predictions. A subset of available data was used to validate the species richness maps. Results: We produced detailed maps of species richness, uncertainties and sampling intensity across Norway's heterogeneous landscape, incorporating 1218 species in our final results. The species richness maps show patterns consistent with previous mapping efforts. 
Validation showed an increase in model accuracy when compared to models which did not use an iSDM framework. The workflow highlights limitations in the infrastructure of the currently openly accessible data, particularly the need for more structured presence-absence datasets and standardized metadata. Main conclusions: This study underscores the potential of workflows that integrate disparate datasets for biodiversity modeling. To maximize accuracy and utility, future efforts should focus on improving data standardization, the publication and collection of more structured data, and fostering data-sharing collaborations. Advances in the workflow itself, including optimising modelling covariates and integrating more comprehensive spatio-temporal aspects, will also increase the relevance of the outputs. These advances will increase our ability to estimate species richness with a precision and accuracy that can reliably inform conservation and management decisions.
Looker, J.; Rock, K. S.; Dyson, L.
Infectious disease time series often show signs of epidemic transitions, such as the peaks and troughs of the time series. In these time series, key system parameters can lead to catastrophic changes in the dynamical system behaviour (often called critical transitions). Modellers have increasingly shown that early warning signals can anticipate these transitions, both critical and non-critical, in infectious disease time series. Existing methods, however, generally focus on univariate time series data, or ignore spatiotemporal patterns that may be present as a disease spreads through a population. Recent ecological literature developments expand existing temporal and spatial methods to consider the covariance matrix of multiple, related time series. However, many of these proposed signals still make an assumption of stationary time series/system equilibrium. Whilst often true in ecological modelling, disease systems are seldom at equilibrium. In this paper, we propose the usage of the eigendecomposition of the non-stationary covariance matrix as a more suitable early warning signal for epidemiological data. We first analyse the expected trends in the eigenvalues and eigenbasis of the covariance matrix on approach to a transition. Next we apply these methods to a spatially-structured susceptible-infectious-recovered model to explore how the eigenbasis may provide extra information to modellers. Finally, we test these methods on SARS-CoV-2 case data during the 2020-2021 pandemic period in England.
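To make the covariance-eigenvalue idea in this abstract more concrete, the sketch below tracks the dominant eigenvalue of the covariance matrix of several related time series over a rolling window. The rolling window is a simplifying assumption introduced here as one basic way to allow the covariance to change through time; it is not the authors' non-stationary estimator, which the abstract does not describe. All function names are hypothetical, and power iteration stands in for a full eigendecomposition to keep the example dependency-free.

```python
import math


def covariance_matrix(series):
    """Sample covariance matrix of equal-length series (one list per region)."""
    n = len(series[0])
    k = len(series)
    means = [sum(s) / n for s in series]
    return [
        [
            sum((series[i][t] - means[i]) * (series[j][t] - means[j]) for t in range(n))
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def dominant_eigenpair(matrix, iterations=200):
    """Approximate the largest eigenvalue and its eigenvector by power iteration.

    Intended for symmetric positive semi-definite matrices. The start
    vector is non-uniform, but power iteration can still fail if it is
    exactly orthogonal to the dominant eigenvector.
    """
    k = len(matrix)
    start = [float(i + 1) for i in range(k)]
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return 0.0, v  # zero matrix, or start vector in the null space
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v for the unit vector v
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue, v


def rolling_dominant_eigenvalue(series, window):
    """Dominant covariance eigenvalue in each rolling window of length `window`."""
    n = len(series[0])
    if window < 2 or window > n:
        raise ValueError("window must be between 2 and the series length")
    return [
        dominant_eigenpair(covariance_matrix([s[t - window:t] for s in series]))[0]
        for t in range(window, n + 1)
    ]
```

A sustained rise in the returned sequence is the kind of trend that covariance-based early warning signals look for; the associated eigenvector indicates which series contribute most to the shared variability.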
Messick, H.; Lichtenberg, E. M.
Questions: Ecological monitoring, the repeated collection of ecological data, is essential to document how ecosystems respond to change. In grasslands, different vegetation monitoring protocols are used across disciplines, making it difficult to address multiple management objectives or research questions. We asked four questions about how three common vegetation monitoring protocols compare. (1) How do the protocols differ in how they collect data? (2) How do the protocols differ in their utility? (3) In what ways do vegetation measurements quantitatively differ across protocols? (4) What are each protocol's strengths? Location: This study was conducted on working ranches in the Southern Great Plains with vegetation consisting mainly of native forbs and grasses. Methods: We implemented three protocols at each site: (1) the Rangeland Analysis Platform (RAP), (2) the Grassland Effectiveness Monitoring (GEM) protocol, and (3) a typical pollinator ecology survey protocol. We qualitatively compared each protocol's utility and quantitatively compared the cover measurements that each produced. Results: All three protocols displayed positive associations within cover categories, but differed in actual cover measurements. The RAP protocol, which uses remote sensing, measured the highest total vegetation cover. The GEM protocol, a line-point intercept method, had more capability to capture fine-scale cover patterns. The GEM protocol measured the most bare ground while the Pollinator protocol measured more forb coverage. Conclusion: Fine-scale methods like the GEM protocol are most appropriate to address objectives that require capturing small patterns that would otherwise be overlooked with methods like quadrats or remote sensing. Remote sensing is advantageous when monitoring large areas or inaccessible land, but may over-estimate cover. The Pollinator protocol is best equipped to address questions regarding flower abundance and richness. 
Similarities among protocols can facilitate synergy across disciplines for more effective monitoring. We emphasize the importance of denoting a clear scale and scope of monitoring objectives before selecting methods.
Ardila-Villamizar, M.; De Clippele, L. H.; Dominoni, D. M.
Convolutional Neural Networks (CNNs) have become increasingly prominent in biodiversity monitoring due to their strong performance in accurately detecting species from sound recordings, overcoming some limitations of traditional methods such as point-counts. Yet, their use in urban ecosystems remains limited, highlighting the need for frameworks that identify modelling strategies to optimize their performance in these complex soundscapes. Here, we evaluated how preprocessing and labelling strategies, detection thresholds, sample size, and architecture affect the performance of CNNs for bird identification in urban tropical ecosystems. We also assessed their potential by comparing CNN-derived biodiversity estimates with those from point-counts and acoustic indices. For this, we used one week of recordings collected along urbanization gradients in five Colombian Andes cities to develop 11 multiclass CNN models varying in spectral representation, labelling strategies, training data source and backbone architecture. The best-performing model, evaluated with F1-scores, combined Log-Mel spectrograms, multispecies labels, ecosystem-specific recordings, a probability threshold of 0.3 and a ConvNeXt backbone, with its performance generally improving with sample size. Although CNNs and point-counts detected partially distinct assemblages, CNN-derived species richness was comparable to that estimated from point-counts. In addition, the Normalized Difference Soundscape Index (NDSI) was positively associated with richness, suggesting its potential as a biodiversity proxy in tropical urban soundscapes. Overall, by identifying effective modelling designs and monitoring strategies, our study advances the development of robust biodiversity assessment frameworks in urbanized ecosystems in the Neotropics whilst also providing methodological guidance for future research and practical insights for wildlife monitoring and conservation.
Rodriguez, L. K.; Schallhart, S.; Hobmeier, P.; Curran, T.; Perez-Jorge, S.; Prieto, R.; Oliveira, C.; Silva, M. A.; Thalinger, B.
Environmental DNA (eDNA) analyses have become a powerful tool for non-invasive biodiversity monitoring, yet the applicability of population genetic approaches to environmental samples remains largely unexplored. Even when genetic traces originate from a single individual, low target DNA concentrations and amplification or sequencing artefacts can compromise downstream genetic inferences. Here, we present a novel approach for obtaining demographic insights and lineage-level mitogenomic information from aquatic eDNA samples collected near vertebrate individuals. Paired eDNA and tissue samples were collected during sperm whale (Physeter macrocephalus) encounters in the Azores. Samples were screened for the presence of vertebrate eDNA and analyzed with a novel molecular sex identification assay. Additionally, long-range PCR was used to amplify up to five mitochondrial DNA fragments (~3-4 kb) before subsequent sequencing on an Oxford Nanopore Technologies platform. A stringent three-tier filtering framework capable of identifying true mitogenomic variation across eDNA samples was developed for maximum recovery of genetic diversity at the haplogroup level. By benchmarking eDNA samples via their paired tissues, parameter values were optimized to maximize concordance and minimize spurious variant calls. Sexing was successful for 50% of eDNA samples, with 96% concordance to paired tissues, and marine vertebrate DNA concentration significantly predicted sexing success. Further, Medaka polishing produced high-identity mitochondrial consensus sequences (>16 kb) from eDNA samples. Across filtering regimes in the framework, curated SNP panels comprising up to 453 high-confidence mitochondrial SNPs resolved 19 haplogroups, with 93% concordance between eDNA and tissue samples. An intermediate bioinformatics filtering strategy maximized biologically accurate haplogroup recovery while minimizing sequencing artefacts, providing the most reliable lineage-level inferences. This integrative approach demonstrates that targeted nuclear assays combined with long-range mitochondrial sequencing can recover individual-level genetic information from aquatic eDNA. By defining analytical thresholds governing success, the framework advances non-invasive genetic monitoring of populations via eDNA and enables population-level monitoring and conservation of endangered and genetically vulnerable species.
Bhattarai, A.; Smith, J.; Abdelgaffar, H.; Carpenter, R.; Mishra, S.; Fuentes, J. L. J.; Shirsekar, G.
This protocol details the extraction of high-molecular-weight genomic DNA from grapevine tissues (wild and cultivated Vitis spp., including pathogen-infected samples) and the subsequent preparation of Illumina® whole-genome sequencing libraries using bead-bound Tn5 transposase. It is designed to overcome challenges from polyphenolic compounds and secondary metabolites in wild plants, providing a cost-effective workflow for large-scale population genomics. It includes recipes for buffers, incubation times, critical notes, and troubleshooting tips to maximize yield and library quality. Although designed for grapevine DNA, this protocol is potentially applicable to other similar wild plant species. Highlights: Optimized CTAB-PTB DNA extraction protocol for field-collected wild plant tissues. Effective removal of polyphenols and secondary metabolites associated with DNA using PTB. Cost-effective Illumina DNA Prep library preparation using bead-bound Tn5 transposase (tagmentation). Scalable workflow suitable for large-scale population genomics in Vitis species. Validated method for high-molecular-weight DNA and high-quality sequencing data. [Graphical abstract: Figure 1]
Akane, O.; Kawaguchi, Y. W.; Niwa, T.; Uno, Y.; Kuraku, S.
The effective management of threatened shark populations relies on accurate demographic data, particularly operational sex ratios. While sex identification in intact shark bodies is straightforward through the presence of external male organs, namely claspers, it remains impossible for processed fins in the illegal wildlife trade, early-stage embryos in breeding programs, or archived tissue fragments and blood samples where morphological traits are lost. Here, we present a robust molecular sexing framework leveraging recently identified sequences from shark sex chromosomes, which, to our current knowledge, are consistently organized in an XY system. Our approach consists of two distinct methodologies tailored to the current identification status of sex chromosome sequences in the target species. For the whale shark Rhincodon typus and the brownbanded bamboo shark Chiloscyllium punctatum, we employed end-point PCR assays targeting male-specific Y-linked markers. For the cloudy catshark Scyliorhinus torazame, we developed a quantitative PCR (qPCR) assay targeting differential X chromosome dosage. In this dosage-based system, females (XX) are distinguished by an amplification profile approximately one cycle earlier than males (XY). By integrating X-linked dosage quantification, our framework provides a critical internal control that significantly enhances reliability, allowing researchers to distinguish true females from PCR failures. This toolkit offers a versatile solution for diverse applications, ranging from the study of sex determination mechanisms in pre-phenotypic embryos to the reconstruction of sex ratios from space-constrained tissue archives and global wildlife forensics, thereby contributing to the comprehensive conservation of shark biodiversity.
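The "approximately one cycle earlier" figure follows from qPCR arithmetic: with near-perfect doubling per cycle, a twofold difference in template copies (two X copies in females, one in males) shifts the threshold cycle by log2(2) = 1. The sketch below works through that reasoning and applies it to a simple sex call. It is only an illustration under stated assumptions: normalisation to an autosomal reference amplicon, equal amplification efficiencies, the decision windows, and the names `expected_ct_shift` and `call_sex` are all hypothetical and not taken from the abstract.

```python
import math


def expected_ct_shift(copy_ratio, efficiency=1.0):
    """Cycles by which Ct shifts for a given template copy ratio.

    efficiency=1.0 means perfect doubling per cycle (an idealisation).
    """
    return math.log(copy_ratio) / math.log(1.0 + efficiency)


def call_sex(ct_x, ct_ref, max_ct=35.0):
    """Call sex from the Ct of an X-linked target and an autosomal reference.

    Assumes both amplicons amplify equally well, so delta Ct is about 0 for
    XX and about 1 for XY. All cutoffs here are illustrative assumptions.
    """
    # A missing or very late Ct is treated as a failed reaction, not a call.
    if ct_x is None or ct_ref is None or ct_x > max_ct or ct_ref > max_ct:
        return "no call"
    delta = ct_x - ct_ref
    if -0.5 <= delta < 0.5:
        return "female"   # two X copies: X target tracks the reference
    if 0.5 <= delta <= 1.5:
        return "male"     # one X copy: X target about one cycle later
    return "no call"      # outside both expected windows


print(expected_ct_shift(2.0))      # twofold dosage difference, ideal efficiency
print(call_sex(24.1, 24.0))        # invented Ct values
print(call_sex(25.0, 24.0))
print(call_sex(None, 24.0))
```

Because the X-linked target should amplify in both sexes, a sample with no usable signal is flagged rather than read as female, which is the internal-control property the abstract describes.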